Modern multicore chips show complex behavior with respect to performance and power. Starting with the Intel Sandy Bridge processor, it has become possible to directly measure the power dissipation of a CPU chip and correlate this data with the performance properties of the running code. Going beyond a simple bottleneck analysis, we employ the recently published Execution-Cache-Memory (ECM) model to describe the single- and multi-core performance of streaming kernels. The model refines the well-known roofline model, since it can predict the scaling and the saturation behavior of bandwidth-limited loop kernels on a multicore chip. The saturation point is especially relevant for considerations of energy consumption. From power dissipation measurements of benchmark programs with vastly different hardware requirements, we derive a simple, phenomenological power model for the Sandy Bridge processor. Together with the ECM model, we are able to explain many peculiarities in the performance and power behavior of multicore processors, and derive guidelines for energy-efficient execution of parallel programs. Finally, we show that the ECM and power models can be successfully used to describe the scaling and power behavior of a lattice-Boltzmann flow solver code.